Maximum Entropy Models and Prepositional Phrase Ambiguity

Author

  • Mark McLauchlan
Abstract

Prepositional phrases are a common source of ambiguity in natural language, and many approaches have been devised to resolve this ambiguity automatically. In particular, several different machine learning approaches have now reached accuracy rates of around 84.5% on the benchmark dataset. Maximum entropy (maxent) models, despite their successful application in many other areas of natural language processing, have only reached 83.7% accuracy on this task. This dissertation shows that maxent models can achieve accuracy rates of 84.7% using standard features, rising to 85.3% when additional features based on Latent Semantic Analysis (LSA) information are used. Three feature selection techniques are compared in this domain: frequency cut-off, information gain (mutual information), and a new method that uses the variation of feature weights across several training sets. A simple frequency cut-off is found to be the most robust method across a variety of models. We also consider ensembles of maxent classifiers created using bagging and noisy bagging, a more effective variant in which additional random noise is added to each training set. The latter approach creates a more diverse set of classifiers, and a maxent ensemble using noisy bagging achieves 85.53% accuracy.

Similar resources

A Maximum Entropy Model for Prepositional Phrase Attachment

For this example, a human annotator's attachment decision, which for our purposes is the "correct" attachment, is to the noun phrase. We present in this paper methods for constructing statistical models for computing the probability of attachment decisions. These models could then be integrated into scoring the probability of an overall parse. We present our methods in the context of prepositio...

Reduction of Maximum Entropy Models to Hidden Markov Models

Maximum Entropy (maxent) models are an attractive formalism for statistical models of many types and have been used for a number of purposes, including language modeling (Rosenfeld 1994), part of speech tagging (Ratnaparkhi 1996), prepositional phrase attachment (Ratnaparkhi 1998), sentence breaking (Reynar and Ratnaparkhi 1997) and parsing (Ratnaparkhi 1997). Maxent models allow the combinatio...

Prepositional Phrase Attachment Ambiguity Resolution Using Semantic Hierarchies

This paper describes a system that resolves prepositional phrase attachment ambiguity in English sentence processing. This attachment problem is ubiquitous in English text, and is widely known as a place where semantics determines syntactic form. The decision is made based on a four-tuple composed of the head verb of the verb phrase, the head noun of the noun phrase, and the preposition and hea...

Leveraging a Semantically Annotated Corpus to Disambiguate Prepositional Phrase Attachment

Accurate parse ranking requires semantic information, since a sentence may have many candidate parses involving common syntactic constructions. In this paper, we propose a probabilistic framework for incorporating distributional semantic information into a maximum entropy parser. Furthermore, to better deal with sparse data, we use a modified version of Latent Dirichlet Allocation to smooth the...

A Rule-Based and MT-Oriented Approach to Prepositional Phrase Attachment

Prepositional phrases are a key issue in structural ambiguity. Recent corpus research provides lexical cues linking prepositions with other words, and this information can be used to partly resolve the ambiguity resulting from prepositional phrases. Two possible attachments are considered in the literature: noun attachment or verb attachment. In this paper, we consider the problem from ...

Publication date: 2001